Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UR][Bindless] Initial implementation of bindless images for HIP #2496

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

GeorgeWeb
Copy link
Contributor

@GeorgeWeb GeorgeWeb commented Dec 20, 2024

This PR implements the entry points in the HIP adapter for the Bindless Images extension.

Some features, while implemented, only partially pass the End-to-end tests for them in DPC++ (the PR that enables the support and testing in DPC++: intel/llvm#16439). This is because the HIP runtime doesn't support, for example, some parameter combinations with some image memory types. The not-fully supported features are marked with TODOs in the device info queries for further work.

Signed-off-by: Georgi Mirazchiyski [email protected]

Co-authored-by: Peter Žužek [email protected]

@github-actions github-actions bot added images UR images hip HIP adapter specific issues labels Dec 20, 2024
@GeorgeWeb GeorgeWeb force-pushed the georgi/bindless-hip branch 2 times, most recently from d7d669e to a89b779 Compare December 20, 2024 13:52
TODO: Complete the description with more information.

Signed-off-by: Georgi Mirazchiyski <[email protected]>

Co-authored-by: Peter Žužek <[email protected]>
@GeorgeWeb GeorgeWeb marked this pull request as ready for review December 23, 2024 15:17
@GeorgeWeb GeorgeWeb requested review from a team as code owners December 23, 2024 15:17
@GeorgeWeb GeorgeWeb requested a review from ldrumm December 23, 2024 15:17
@GeorgeWeb
Copy link
Contributor Author

GeorgeWeb commented Jan 15, 2025

@oneapi-src/unified-runtime-hip-write , @oneapi-src/unified-runtime-bindless-images-write Heyo, that's a whole lot of changes and additions, so I am just pinging as a reminder in case it was lost in messages over the Winter holidays. There is no rush in reviewing immediately, I just wanted to remind about it. Thank you!

Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I was able to give the most thorough review with respect to specific HIP APIs or all of the ins and outs of the image properties and logic concerning rows and pitches and so on.

return static_cast<ur_sampler_addressing_mode_t>((Props >> 2) & 0b111);
}

ur_sampler_addressing_mode_t getAddressingModeDim(size_t i) const noexcept {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert i < 3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye! Sure.

@@ -88,6 +95,13 @@ struct ur_device_handle_t_ {
int getConcurrentManagedAccess() const noexcept {
return ConcurrentManagedAccess;
};

bool supportsHardwareImages() const noexcept {
return HardwareImageSupport ? true : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little clunky. I see that hipDeviceAttributeImageSupport returns ‘1’ if Device supports image, ‘0’ otherwise. (source).

Perhaps we could store HardwareImageSupport as a boolean, and format the return value of hipDeviceGetAttribute once?

    // Check if texture functions are supported in the HIP host runtime.
    int ret;
    UR_CHECK_ERROR(hipDeviceGetAttribute(
        &ret, hipDeviceAttributeImageSupport, HIPDevice));
    detail::ur::assertion(ret == 0 || ret == 1);
    /*(bool)*/ HardwareImageSupport = ret == 1;

? Or just return HardwareImageSupport == 1 here I guess.

Copy link
Contributor Author

@GeorgeWeb GeorgeWeb Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's another one (attribute member) stored as integer but used as boolean too here. Anyways, since this one is going to be used as boolean in any situation I can imagine, it may make more sense to just store as boolean. I'll do that. Thanks.

}

// Used for bookkeeping for mipmapped array leaks in mapping external memory.
std::map<hipArray_t, hipMipmappedArray_t> ChildHipArrayFromMipmapMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the order important? Would a std::unordered_map bring any benefits (it's usually faster)?

Copy link
Contributor Author

@GeorgeWeb GeorgeWeb Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair question. I don't know the answer to this, as it is a change I was asked to adapt from the Cuda backend implementation that fixes a race/leak. I will double check and come back to it. Hence, I didn't change the data structure. I will double check with the team to see if the order is important and if not I'll switch to std::unordered_map.

return ReturnValue(ur_bool_t{false});
}
case UR_DEVICE_INFO_BINDLESS_SAMPLED_IMAGE_FETCH_1D_USM_EXP: {
// HIP does support fetching 1D USM sampled image data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a personal preference but I prefer supports (as in the comment(s) above) to does support (below). I initially parsed it as does not support.

}
case UR_DEVICE_INFO_BINDLESS_SAMPLED_IMAGE_FETCH_1D_EXP: {
// HIP does not support fetching 1D non-USM sampled image data.
// TODO: DPC++ doesn't implement the required builtins for SYCL.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have a TODO here if HIP doesn't support it anyway?


if (memType == hipMemoryTypeArray) {
// HIP doesn not provide async copies between host and image arrays
// memory in versions early than 6.2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// memory in versions early than 6.2.
// memory in versions earlier than 6.2.


if (memType == hipMemoryTypeArray) {
// HIP doesn not provide async copies between image arrays and host
// memory in versions early than 6.2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// memory in versions early than 6.2.
// memory in versions earlier than 6.2.

cpy_desc.dstY = pCopyRegion->dstOffset.y;
cpy_desc.dstMemoryType = hipMemoryTypeHost;
cpy_desc.dstHost = pDst;
if (pSrcImageDesc->rowPitch == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay that cpy_desc.srcPitch is uninitialized in this path? I might have missed similar instances of this question before/after.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think here we want it (srcPitch) zero-initialised. I thought it would be init to that, originally, from the looking at the hip_Memcpy2D cpy_desc = {}; construction (another one taken from the Cuda adapter counterpart). That was on the idea that, assuming that's a POD type, it's all zero-init. However, if my understanding is wrong it may be better to do it explictly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cuda CU_MEMORYTYPE_ARRAY, which is equivalent to hipMemoryTypeArray, arrays are opaque memory and do not have (at least exposed) notion of pitch.

So it is not set as it is not expected for the pitch field to be used at all.

Copy link
Contributor Author

@GeorgeWeb GeorgeWeb Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, after double checking this looks correct. Actually, thank you for clarifying this @DBDuncan!

I think Fraser's concern was also a little more general here as well - that we might end up with a garbage value if left unitialised (which isn't good practice) - ideally we want it zero-init in such cases. Luckily, since C++11 with the brace-init syntax via the uniform init-brackets {} that are used for cpy_desc here (considering hip_Memcpy2D is a POD or trivially default-constructible), the members that are not explicitly initialised should get zero-init (or nullptr for pointers).

extMemDesc.type = hipExternalMemoryHandleTypeOpaqueWin32;
break;
case UR_EXP_EXTERNAL_MEM_TYPE_WIN32_NT_DX12_RESOURCE:
// Memory descriptor flag values such as hipExternalMemoryDedicatedare
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Memory descriptor flag values such as hipExternalMemoryDedicatedare
// Memory descriptor flag values such as hipExternalMemoryDedicated are

case UR_EXP_EXTERNAL_SEMAPHORE_TYPE_WIN32_NT_DX12_FENCE:
extSemDesc.type = hipExternalSemaphoreHandleTypeD3D12Fence;
break;
case UR_EXP_EXTERNAL_SEMAPHORE_TYPE_OPAQUE_FD:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this case falling through to default would be better served with a [[fallthrough]] annotation. Or left out? There's at least one more case of this further up if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll prefer a [[fallthrough]] annotation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hip HIP adapter specific issues images UR images
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants